12 research outputs found

    Learning non-linear invariants for unsupervised out-of-distribution detection

    An important hurdle to overcome before machine learning models can be reliably deployed in practice is identifying when samples are different from those seen during training, as the outputs for unexpected samples are often confidently incorrect without being identifiable as such. This problem is known as out-of-distribution (OOD) detection. A popular approach for the unsupervised OOD case is to reject samples with a high Mahalanobis distance with respect to the mean features of the training data. Recent work showed that the Mahalanobis distance can be thought of as finding the training data invariants, and rejecting OOD samples that violate them. A key limitation of this approach is that it captures linear relations only. Here, we present a novel method capable of identifying non-linear invariants in the data. These are learned using a reversible neural network, consisting of alternating rotation and coupling layers. Results on a varied set of tasks show it to be the best method overall, achieving state-of-the-art results on some of the experiments.
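
The Mahalanobis-distance baseline mentioned in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the synthetic 4-dimensional features, the small ridge term `eps`, and the 99th-percentile rejection threshold are all assumptions made for the example.

```python
import numpy as np

def fit_gaussian(features, eps=1e-6):
    """Estimate the mean and inverse covariance of training features (n_samples, n_dims)."""
    mean = features.mean(axis=0)
    cov = np.atleast_2d(np.cov(features, rowvar=False))
    # Small ridge term (assumed) keeps the covariance invertible.
    cov = cov + eps * np.eye(features.shape[1])
    return mean, np.linalg.inv(cov)

def mahalanobis_score(x, mean, cov_inv):
    """Mahalanobis distance of one sample (n_dims,) or a batch (n, n_dims) to the training mean."""
    diff = x - mean
    return np.sqrt(np.einsum("...i,ij,...j->...", diff, cov_inv, diff))

# Synthetic stand-in for features extracted from in-distribution training data.
rng = np.random.default_rng(0)
train = rng.normal(size=(500, 4))
mean, cov_inv = fit_gaussian(train)

# Assumed rejection rule: flag anything beyond the 99th percentile of training scores.
threshold = np.quantile(mahalanobis_score(train, mean, cov_inv), 0.99)

inlier = np.zeros(4)
outlier = np.full(4, 10.0)
is_ood = mahalanobis_score(outlier, mean, cov_inv) > threshold
```

Because the score depends only on the mean and covariance of the training features, it can express linear relations alone, which is the limitation the abstract sets out to address.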

    Comparison of outlier detection methods on astronomical image data

    Among the many challenges posed by the huge data volumes produced by the new generation of astronomical instruments there is also the search for rare and peculiar objects. Unsupervised outlier detection algorithms may provide a viable solution. In this work we compare the performances of six methods: the Local Outlier Factor, Isolation Forest, k-means clustering, a measure of novelty, and both a normal and a convolutional autoencoder. These methods were applied to data extracted from SDSS stripe 82. After discussing the sensitivity of each method to its own set of hyperparameters, we combine the results from each method to rank the objects and produce a final list of outliers.
    Comment: Preprint version of the accepted manuscript to appear in the Volume "Intelligent Astrophysics" of the series "Emergence, Complexity and Computation", Book eds. I. Zelinka, D. Baron, M. Brescia, Springer Nature Switzerland, ISSN: 2194-728
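
The abstract does not say how the per-method results are combined. One common option, shown here purely as an assumption, is to average each object's rank across methods. The method labels, the toy scores, and the convention that a higher score means more anomalous are all hypothetical.

```python
import numpy as np

def ranks_desc(scores):
    """Rank objects so that rank 1 is the highest (most anomalous) score."""
    order = np.argsort(-scores, kind="stable")
    ranks = np.empty(len(scores), dtype=int)
    ranks[order] = np.arange(1, len(scores) + 1)
    return ranks

def combine_by_mean_rank(score_table):
    """Average the per-method ranks of each object (lower mean rank = stronger outlier)."""
    per_method = np.stack([ranks_desc(s) for s in score_table.values()])
    return per_method.mean(axis=0)

# Toy anomaly scores for four objects from three hypothetical detectors.
scores = {
    "lof": np.array([0.1, 0.9, 0.3, 0.2]),
    "iforest": np.array([0.2, 0.8, 0.1, 0.4]),
    "autoencoder": np.array([0.1, 0.7, 0.2, 0.3]),
}
mean_rank = combine_by_mean_rank(scores)
# Object indices ordered from most to least anomalous.
final_order = np.argsort(mean_rank, kind="stable")
```

Averaging ranks rather than raw scores sidesteps the fact that different detectors produce scores on incomparable scales.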

    Stochastic Segmentation with Conditional Categorical Diffusion Models

    Semantic segmentation has made significant progress in recent years thanks to deep neural networks, but the common objective of generating a single segmentation output that accurately matches the image's content may not be suitable for safety-critical domains such as medical diagnostics and autonomous driving. Instead, multiple possible correct segmentation maps may be required to reflect the true distribution of annotation maps. In this context, stochastic semantic segmentation methods must learn to predict conditional distributions of labels given the image, but this is challenging due to the typically multimodal distributions, high-dimensional output spaces, and limited annotation data. To address these challenges, we propose a conditional categorical diffusion model (CCDM) for semantic segmentation based on Denoising Diffusion Probabilistic Models. Our model is conditioned on the input image, enabling it to generate multiple segmentation label maps that account for the aleatoric uncertainty arising from divergent ground truth annotations. Our experimental results show that CCDM achieves state-of-the-art performance on LIDC, a stochastic semantic segmentation dataset, and outperforms established baselines on the classical segmentation dataset Cityscapes.
    Comment: Code available at https://github.com/LarsDoorenbos/ccdm-stochastic-segmentatio

    Unsupervised out-of-distribution detection for safer robotically-guided retinal microsurgery

    Purpose: A fundamental problem in designing safe machine learning systems is identifying when samples presented to a deployed model differ from those observed at training time. Detecting so-called out-of-distribution (OoD) samples is crucial in safety-critical applications such as robotically-guided retinal microsurgery, where distances between the instrument and the retina are derived from sequences of 1D images that are acquired by an instrument-integrated optical coherence tomography (iiOCT) probe. Methods: This work investigates the feasibility of using an OoD detector to identify when images from the iiOCT probe are inappropriate for subsequent machine learning-based distance estimation. We show how a simple OoD detector based on the Mahalanobis distance can successfully reject corrupted samples coming from real-world ex-vivo porcine eyes. Results: Our results demonstrate that the proposed approach can successfully detect OoD samples and help maintain the performance of the downstream task within reasonable levels. MahaAD outperformed a supervised approach trained on the same kind of corruptions and achieved the best performance in detecting OoD cases from a collection of iiOCT samples with real-world corruptions. Conclusion: The results indicate that detecting corrupted iiOCT data through OoD detection is feasible and does not need prior knowledge of possible corruptions. Consequently, MahaAD could aid in ensuring patient safety during robotically-guided microsurgery by preventing deployed prediction models from estimating distances that put the patient at risk.
    Comment: Accepted at IPCAI 202

    Data Invariants to Understand Unsupervised Out-of-Distribution Detection

    Unsupervised out-of-distribution (U-OOD) detection has recently attracted much attention due to its importance in mission-critical systems and broader applicability over its supervised counterpart. Despite this increased attention, U-OOD methods suffer from important shortcomings. By performing a large-scale evaluation on different benchmarks and image modalities, we show in this work that most popular state-of-the-art methods are unable to consistently outperform a simple anomaly detector based on pre-trained features and the Mahalanobis distance (MahaAD). A key reason for the inconsistencies of these methods is the lack of a formal description of U-OOD. Motivated by a simple thought experiment, we propose a characterization of U-OOD based on the invariants of the training dataset. We show how this characterization is unknowingly embodied in the top-scoring MahaAD method, thereby explaining its quality. Furthermore, our approach can be used to interpret predictions of U-OOD detectors and provides insights into good practices for evaluating future U-OOD methods

    SS3D: Unsupervised Out-of-Distribution Detection and Localization for Medical Volumes

    We present an extension of the self-supervised outlier detection (SSD) framework to the three-dimensional case. We first apply contrastive learning on a network using a general dataset of two-dimensional slices randomly sampled from all the available training data. This network serves as a latent embedding encoder of the input images. We model the in-distribution latent density as a multivariate Gaussian, fitted to the embeddings of the training slices. At test time, each test sample is scored by summing the Mahalanobis distances from all its slices to the means of the learned Gaussians. While mainly meant as a sample-level method, this approach additionally enables coarse localization, scoring each voxel by the minimum Mahalanobis distance among the slices that contain it. On the sample-level task of the 2021 MICCAI Medical Out-of-Distribution Analysis Challenge, our method ranked second on the challenging abdominal dataset, and fourth overall. Moreover, we show that with pretrained features and the right choice of architecture, a further boost in performance can be gained
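
The two scoring rules described in this abstract (sum over slices for the sample, minimum over containing slices for each voxel) can be sketched as below. This is an illustrative reading rather than the authors' code: it assumes the per-slice Mahalanobis distances have already been computed, and that slices are taken along each of the three volume axes, so every voxel belongs to exactly three slices. The toy distances and function names are hypothetical.

```python
import numpy as np

def sample_score(slice_dists):
    """Sample-level score: sum of the Mahalanobis distances of all slices."""
    return float(sum(d.sum() for d in slice_dists))

def voxel_scores(slice_dists):
    """Voxel-level score: minimum distance among the slices that contain the voxel.

    slice_dists holds one 1D array per axis; voxel (i, j, k) lies in slice i of
    axis 0, slice j of axis 1, and slice k of axis 2 (assumed slicing scheme).
    """
    d0, d1, d2 = slice_dists
    return np.minimum(
        np.minimum(d0[:, None, None], d1[None, :, None]),
        d2[None, None, :],
    )

# Toy per-slice distances for a 2 x 3 x 2 volume.
dists = (
    np.array([1.0, 5.0]),       # slices along axis 0
    np.array([2.0, 4.0, 6.0]),  # slices along axis 1
    np.array([3.0, 7.0]),       # slices along axis 2
)
total = sample_score(dists)
vmap = voxel_scores(dists)
```

Taking the minimum means a voxel is marked anomalous only when every slice through it looks anomalous, which fits the coarse localization the abstract describes.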

    Generating astronomical spectra from photometry with conditional diffusion models

    A trade-off between speed and information controls our understanding of astronomical objects. Fast-to-acquire photometric observations provide global properties, while costly and time-consuming spectroscopic measurements enable a better understanding of the physics governing their evolution. Here, we tackle this problem by generating galaxy spectra directly from photometry, through which we obtain an estimate of their intricacies from easily acquired images. This is done by using multimodal conditional diffusion models, where the best out of the generated spectra is selected with a contrastive network. Initial experiments on minimally processed SDSS data show promising results

    : A tool for one-shot sky exploration and its application for detection of active galactic nuclei

    Context. Modern sky surveys are producing ever larger amounts of observational data, which makes the application of classical approaches for the classification and analysis of objects challenging and time consuming. However, this issue may be significantly mitigated by the application of automatic machine and deep learning methods. Aims. We propose uliss